VLSlice: Interactive Vision-and-Language Slice Discovery
Recent work in vision-and-language demonstrates that large-scale pretraining
can learn generalizable models that are efficiently transferable to downstream
tasks. While this may improve dataset-scale aggregate metrics, analyzing
performance around hand-crafted subgroups targeting specific bias dimensions
reveals systemic undesirable behaviors. However, this subgroup analysis is
frequently stalled by annotation efforts, which require extensive time and
resources to collect the necessary data. Prior work attempts to automatically
discover subgroups to circumvent these constraints, but it typically leverages
model behavior on existing task-specific annotations, degrades rapidly on inputs
more complex than tabular data, and does not study vision-and-language models.
vision-and-language models. This paper presents VLSlice, an interactive system
enabling user-guided discovery of coherent representation-level subgroups with
consistent visiolinguistic behavior, denoted as vision-and-language slices,
from unlabeled image sets. In a user study (n=22), we show that VLSlice enables
users to quickly generate diverse, high-coherency slices, and we release the
tool publicly.
Comment: Conference paper at ICCV 2023. 17 pages, 11 figures.
https://ericslyman.com/vlslice
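To make "representation-level subgroups" concrete, here is a minimal illustrative sketch of one plausible approach: rank unlabeled images by similarity to a text query in a joint vision-language embedding space, then cluster the image embeddings into candidate slices. This is an assumption-laden toy (tiny 2-D embeddings, plain k-means), not the actual VLSlice algorithm; the function name `discover_slices` is hypothetical.

```python
import numpy as np

def discover_slices(image_embs, text_emb, k=2, iters=20, seed=0):
    """Toy slice discovery: score images against a text query, then
    cluster their normalized embeddings into k candidate slices
    with spherical k-means. Illustrative only."""
    # Cosine similarity between each image embedding and the query.
    img = image_embs / np.linalg.norm(image_embs, axis=1, keepdims=True)
    txt = text_emb / np.linalg.norm(text_emb)
    sims = img @ txt
    # Simple k-means over the unit-normalized image embeddings.
    rng = np.random.default_rng(seed)
    centers = img[rng.choice(len(img), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmax(img @ centers.T, axis=1)
        for j in range(k):
            members = img[labels == j]
            if len(members):
                c = members.mean(axis=0)
                centers[j] = c / np.linalg.norm(c)
    return labels, sims

# Two well-separated toy "image" embedding groups plus a query vector.
embs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
query = np.array([1.0, 0.2])
labels, sims = discover_slices(embs, query, k=2)
```

In an interactive tool, a user would inspect and refine such clusters; the point of the sketch is only that coherent slices can emerge from unlabeled embeddings without task-specific annotation.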
CNN 101: Interactive Visual Learning for Convolutional Neural Networks
The success of deep learning in solving problems previously thought intractable
has inspired many non-experts to learn about and understand this exciting technology.
However, it is often challenging for learners to take the first steps due to
the complexity of deep learning models. We present our ongoing work, CNN 101,
an interactive visualization system for explaining and teaching convolutional
neural networks. Through tightly integrated interactive views, CNN 101 offers
both overview and detailed descriptions of how a model works. Built using
modern web technologies, CNN 101 runs locally in users' web browsers without
requiring specialized hardware, broadening public access to education about
modern deep learning techniques.
Comment: CHI'20 Late-Breaking Work (April 25-30, 2020), 7 pages, 3 figures.
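The core operation that a CNN explainer like this visualizes is the convolution that turns an input image into a feature map. A minimal sketch, assuming a single-channel image and valid-mode sliding windows (the example image and kernel are invented for illustration):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid-mode 2D convolution (cross-correlation, as CNNs use it):
    slide the kernel over the image and sum elementwise products."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

# A vertical-edge detector applied to a tiny image with a sharp edge.
image = np.array([[0., 0., 1., 1.],
                  [0., 0., 1., 1.],
                  [0., 0., 1., 1.],
                  [0., 0., 1., 1.]])
kernel = np.array([[-1., 1.],
                   [-1., 1.]])
fmap = conv2d(image, kernel)  # responds strongly where the edge sits
```

The resulting feature map is large exactly where the dark-to-bright edge falls under the kernel, which is the kind of intermediate activation an interactive visualization renders step by step.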
Human-centered AI through scalable visual data analytics
While artificial intelligence (AI) has led to major breakthroughs in many domains, understanding machine learning models remains a fundamental challenge. How can we make AI more accessible and interpretable, or more broadly, human-centered, so that people can easily understand and effectively use these complex models? My dissertation addresses these fundamental and practical challenges in AI through a human-centered approach, by creating novel data visualization tools that are scalable, interactive, and easy to learn and to use. With such tools, users can better understand models by visually exploring how large input datasets affect the models and their results. Specifically, my dissertation focuses on three interrelated parts:
(1) Unified scalable interpretation: developing scalable visual analytics tools that help engineers interpret industry-scale deep learning models at both instance- and subset-level (e.g., ActiVis deployed by Facebook);
(2) Data-driven model auditing: designing visual data exploration tools that support discovery of insights through exploration of data groups across different analytics stages, such as model comparison (e.g., MLCube) and fairness auditing (e.g., FairVis); and
(3) Learning complex models by experimentation: building interactive tools that broaden people's access to learning complex deep learning models (e.g., GAN Lab) and browsing raw datasets (e.g., ETable).
My research has made significant impact on society and industry. The ActiVis system for interpreting deep learning models has been deployed on Facebook's machine learning platform. The GAN Lab tool for learning GANs has been open-sourced in collaboration with Google, and its demo has been used by more than 70,000 people from over 160 countries.
Ph.D. dissertation
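The fairness-auditing idea in part (2) can be illustrated with a minimal sketch: compute a model's accuracy per data subgroup and surface the disparity between groups. This is in the spirit of tools like FairVis, not their actual implementation; the records and group names below are hypothetical.

```python
def subgroup_accuracy(records):
    """Given (group, label, prediction) records, compute per-group accuracy."""
    totals, correct = {}, {}
    for group, label, pred in records:
        totals[group] = totals.get(group, 0) + 1
        correct[group] = correct.get(group, 0) + (label == pred)
    return {g: correct[g] / totals[g] for g in totals}

# Hypothetical audit data: the model does worse on group "B".
records = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 1),  # group A: 3/4 correct
    ("B", 1, 0), ("B", 0, 0), ("B", 1, 1), ("B", 1, 0),  # group B: 2/4 correct
]
acc = subgroup_accuracy(records)
gap = max(acc.values()) - min(acc.values())  # disparity surfaced to the auditor
```

A visual auditing tool extends exactly this computation across many candidate subgroups at once, letting the analyst spot which groups a model underserves.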